Acquisition of Ultrasound, Video and Acoustic Speech Data for a Silent-Speech Interface Application

Authors

  • T. Hueber
  • G. Chollet
  • B. Denby
  • M. Stone
Abstract

This article addresses synchronous acquisition of high-speed multimodal speech data, composed of ultrasound and optical images of the vocal tract together with the acoustic speech signal, for a silent speech interface. Built around a laptop-based portable ultrasound machine (Terason T3000) and an industrial camera, an acquisition setup is described together with its acquisition software called Ultraspeech. The system is currently able to record ultrasound images at 70 fps and optical images at 60 fps, synchronously with the acoustic signal. An interactive inter-session re-calibration mechanism which allows recording of large audiovisual speech databases in multiple acquisition sessions is also described.


Similar resources

Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips

The article describes a video-only speech recognition system for a “silent speech interface” application, using ultrasound and optical images of the voice organ. A one-hour audiovisual speech corpus was phonetically labeled using an automatic speech alignment procedure and robust visual feature extraction techniques. HMM-based stochastic models were estimated separately on the visual and acoust...

Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips

This article presents a segmental vocoder driven by ultrasound and optical images (standard CCD camera) of the tongue and lips for a “silent speech interface” application, usable either by a laryngectomized patient or for silent communication. The system is built around an audio–visual dictionary which associates visual to acoustic observations for each phonetic class. Visual features are extra...

Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface

This paper presents recent developments on our “silent speech interface” that converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. In our previous studies, the mapping between the observed articulatory movements and the resulting speech sound was achieved using a unit selection approach. We investigate here the use of statistical mapping techniques, ba...

Phone recognition from ultrasound and optical video sequences for a silent speech interface

Latest results on continuous speech phone recognition from video observations of the tongue and lips are described in the context of an ultrasound-based silent speech interface. The study is based on a new 61-minute audiovisual database containing ultrasound sequences of the tongue as well as both frontal and lateral view of the speaker’s lips. Phonetically balanced and exhibiting good diphone ...

An Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model

This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...


Publication date: 2009